-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that generated GUIDs conform to the RFC 4122 standard. #969
Conversation
RFC 4122 describes the format of a UUID (or as Microsoft calls them, GUID) as xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, where M is the UUID version and some bits of N denote the UUID variant. The GUID generator in this repository sets these nibbles to random values, making the generated UUIDs incompatible with the RFC. This may cause problems later on, as there is a chance future UUID implementations may misbehave on nonconformant UUIDs or not accept them at all. For example, UUID libraries may assume that a UUID with version 0 is a nil uuid and compare all such UUIDs to be equal to 00000000-0000-0000-0000-000000000000 and each other. This commit fixes this by setting the UUID version and variant properly in generated UUIDs. The same number of bytes is still used for the ticks, while there is now one byte less randomness in each UUID. However, together with the ticks in 100 ns precision, the chance of a collision is still negligible. The version used is 4, for a UUID based on randomness; we hide the fact that most of the UUID is not truly random but based on time. The variant is 0b10xx for RFC 4122. This commit also adds tests for the new properties as well as the old properties presumed by the existing implementation, such as that generated GUIDs are sorted on time and unique, non-empty GUIDs are generated.
Is possible to reuse Sequential GUIDs code in SequentialGuidValueGenerator from EF Core instead? Further detail about what the algorithm does at https://stackoverflow.com/a/47682820/310601. Also see algorithm to generate ordered GUIDs (COMBs) at GuidCombGenerator for NHibernate. |
@caleblloyd I saw that you did an earlier Sequential GUIDs PR, do you have any comments? |
Thank you for your PR! Though I don't share your concerns about future incompatibilities of the current implementation, I think it is a good idea to advertise conformance with Variant 1, Version 4 of RFC 4122 and I don't see any downsides to your approach. |
@mguinness Interestingly, SQL Server 14 does not seem to implement its
This is an example of how a sequence of generated Guids look like in MSSQL:
On the other hand, just executing the following line on MSSQL will keep all bytes in order: insert into dbo.UuidTable (UuidColumn) values ('3E1B727F-5C6B-4692-BB0D-4F195B883EBB');
And according to the public override Guid Next(EntityEntry entry)
{
var guidBytes = Guid.NewGuid().ToByteArray();
var counterBytes = BitConverter.GetBytes(Interlocked.Increment(ref _counter));
if (!BitConverter.IsLittleEndian)
{
Array.Reverse(counterBytes);
}
guidBytes[08] = counterBytes[1];
guidBytes[09] = counterBytes[0];
guidBytes[10] = counterBytes[7];
guidBytes[11] = counterBytes[6];
guidBytes[12] = counterBytes[5];
guidBytes[13] = counterBytes[4];
guidBytes[14] = counterBytes[3];
guidBytes[15] = counterBytes[2];
return new Guid(guidBytes);
} @caleblloyd's optimized implementation for both, binary based and character base columns, is still preserved with this PR. It's just being fixed to conform to the standard. |
@smitpatel @bricelam @AndriySvyryd @roji @maumar Interesting discussion here on GUIDs. Most specifically, that SQL Server may have changed sequential GUID generation since we last looked at it. Should we react in EF? |
It makes sense to conform to the RFC. Thanks for adding the tests. I think they need to be moved out of the Functional tests and into the Unit tests at |
@ajcvickers yeah, we should definitely consider changing on our side... |
@roji Can you file an issue outlining what you think we should do? |
At a lower level |
@mguinness Good links! They also lead to UNRAVELING THE MYSTERIES OF NEWSEQUENTIALID with a practical analysis and examples of the differences between the individual According to the link, something similar to the following pseudo code is being done:
The proposed solution by @sgielen (based on the previous work by @caleblloyd) is similar to |
- Make TreatTinyAsBoolean virtual as well. - Move tests to EFCore.MySql.Tests.
Thanks for your feedback, all! I've addressed relevant proposed changes in the last commit. |
Issue on the EF Core repo to investigate behavior on the SQL Server side: dotnet/efcore#19124 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation is great and checks out!
Only minor naming issues that need to be addressed.
We will merge right away after they are resolved.
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.MySql.Tests/ValueGeneration/Internal/MysqlSequentialGuidValueGeneratorTest.cs
Outdated
Show resolved
Hide resolved
Thank you @lauxjpn for your feedback, I've addressed the naming issues you mentioned. |
@sgielen Thank you for your contribution! |
RFC 4122 describes the format of a UUID (or as Microsoft calls them, GUID) as
xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx, where M is the UUID version and some bits
of N denote the UUID variant. The GUID generator in this repository sets these
nibbles to random values, making the generated UUIDs incompatible with the RFC.
This may cause problems later on, as there is a chance future UUID
implementations may misbehave on nonconformant UUIDs or not accept them at all.
For example, UUID libraries may assume that a UUID with version 0 is a nil uuid
and compare all such UUIDs to be equal to 00000000-0000-0000-0000-000000000000
and each other.
This commit fixes this by setting the UUID version and variant properly in
generated UUIDs. The same number of bytes is still used for the ticks, while
there is now one byte less randomness in each UUID. However, together with the
ticks in 100 ns precision, the chance of a collision is still negligible. The
version used is 4, for a UUID based on randomness; we hide the fact that most
of the UUID is not truly random but based on time. The variant is 0b10xx for
RFC 4122.
This commit also adds tests for the new properties as well as the old
properties presumed by the existing implementation, such as that generated
GUIDs are sorted on time and unique, non-empty GUIDs are generated.